A new approach to syntactic annotation
نویسندگان
چکیده
Abstract Many systems have been developed for creating syntactically annotated corpora. However, they mainly focus on interface usability and hardly pay attention to knowledge sharing among annotators in the task. In order to incorporate the functionality of knowledge sharing, we emphasized the importance of normalizing the annotation process. As a first step toward knowledge sharing, this paper proposes a method of system initiative annotation in which the system suggests annotators the order of ambiguities to solve. To be more concrete, the system forces annotators to solve ambiguity of constituent structure in a top-down and depth-first manner, and then to solve ambiguity of grammatical category in a bottom-up and breadth-first manner. We implemented the system on top of eBonsai, our annotation tool, and conducted experiments to compare eBonsai and the proposed system in terms of annotation accuracy and efficiency. We found that at least for novice annotators, the proposed system is more efficient while keeping annotation accuracy comparable with eBonsai.
منابع مشابه
Annotation in Architecture: A Systematic Approach toward Mobilization and Development of Theoretical, Research, and Critical Basis in Architecture
Annotations usually refer to marginal notes that explain a difficult or ambiguous subject, provide a general definition or a critical remark for a particular part of a text. Historically, annotating was a well-known tradition in Islamic sciences and was used especially in times when there were less new potentials for generating new knowledge. The main question of this research is, can the tradi...
متن کاملAn annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
متن کاملA Common XML-based Framework for Syntactic Annotation
It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have developed a framework comprised of an abstract model for a variety of different annotation types (e.g., morpho-syntactic tagging, syntactic annotation, co-referen...
متن کاملA Common Framework for Syntactic Annotation
It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have developed a representation framework comprised of an abstract model for a variety of different annotation types (e.g., morpho-syntactic tagging, syntactic annotat...
متن کاملA Common XML-based Framework for Syntactic Annotations
It is widely recognized that the proliferation of annotation schemes runs counter to the need to re-use language resources, and that standards for linguistic annotation are becoming increasingly mandatory. To answer this need, we have developed a framework comprised of an abstract model for a variety of different annotation types (e.g., morpho-syntactic tagging, syntactic annotation, co-referen...
متن کاملParallel Entity And Treebank Annotation
We describe a parallel annotation approach for PubMed abstracts. It includes both entity/relation annotation and a treebank containing syntactic structure, with a goal of mapping entities to constituents in the treebank. Crucial to this approach is a modification of the Penn Treebank guidelines and the characterization of entities as relation components, which allows the integration of the enti...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006